Summary

One of the outstanding problems in data assimilation has been and continues to be how 
best to utilize satellite data while balancing the tradeoff between accuracy and computational 
cost. A number of weather prediction centers have recently achieved remarkable success in 
improving their forecast skill by changing the method by which satellite data are assimilated 
into the forecast model from the traditional approach of assimilating retrievals to the di- 
rect assimilation of radiances in a variational framework. The operational implementation of 
such a substantial change in methodology involves a great number of technical details, e.g., 
pertaining to quality control procedures, systematic error correction techniques, and tuning 
of the statistical parameters in the analysis algorithm. Although there are clear theoretical 
advantages to the direct radiance assimilation approach, it is not obvious at all to what 
extent the improvements that have been obtained so far can be attributed to the change in 
methodology, or to various technical aspects of the implementation. The issue is of interest 
because retrieval assimilation retains many practical and logistical advantages which may 
become even more significant in the near future when increasingly high-volume data sources 
become available. 

The central question we address here is: how much improvement can we expect from as- 
similating radiances rather than retrievals, all other things being equal? We compare the two 
approaches in a simplified one-dimensional theoretical framework, in which problems related 
to quality control and systematic error correction are conveniently absent. By assuming a 
perfect radiative transfer model and perfect knowledge of radiance and background error 
covariances, we are able to formulate a nonlinear local error analysis for each assimilation 
method. Direct radiance assimilation is optimal in this idealized context, while the tradi- 
tional method of assimilating retrievals is suboptimal because it ignores the cross-covariances 
between background errors and retrieval errors. We show that interactive retrieval assimila- 
tion (where the same background used for assimilation is also used in the retrieval step) is 
equivalent to direct assimilation of radiances with suboptimal analysis weights. By examining
the weights in different scenarios, e.g., when the dependence of the retrieval on background 
information varies, we are able to conclude that the effect of neglecting the cross-covariances 
in retrieval assimilation is potentially most harmful for vertical modes for which the infor- 
mation content of the background roughly balances the information content of the radiance 
data. 

We illustrate and extend these theoretical arguments with several one-dimensional as- 
similation experiments, where we estimate vertical atmospheric profiles using simulated data 
from both the High-resolution InfraRed Sounder 2 (HIRS2) and the future Atmospheric In- 
fraRed Sounder (AIRS). The improvement in analysis accuracy obtained by directly assimi- 
lating the radiance data, rather than interactively retrieved profiles, is generally small in our 
experiments. In the case of non-interactive retrievals, the results depend very much on the quality
of the background information used for the retrieval step. In all cases, the impact of the
choice of assimilation method is dwarfed by the effect of changing some of the experimental 
parameters that control the simulated error characteristics of the data and the background. 
In practice, of course, the uncertainties in many of these parameters are considerable, since 
radiative transfer models are far from perfect, and radiance and background error covari- 
ances are not accurately known. These issues affect all assimilation methods and must be 
dealt with in details of implementation, which will then ultimately determine the quality of 
the assimilation products. 


1. Introduction 

A data assimilation system (DAS) estimates the state of the atmosphere by combining 
different types of atmospheric observations with a short-term model forecast (often referred 
to as the first-guess or background field). Assimilated data types include, for example, in 
situ measurements of temperature, moisture, and wind, obtained from radiosonde soundings. 
Such conventional observations have a high vertical resolution but their geographical coverage 
is mostly limited to land areas in the northern hemisphere. Satellite observations, on the 
other hand, provide a more uniform spatial coverage but are hampered by a relatively poor 
vertical resolution. This stems from the fact that the satellite-borne instruments measure 
quantities that are functionals of the atmospheric state variables, such as radiances emitted in 
certain spectral bands, or integrals of atmospheric refractivity, rather than the state variables 
themselves. 

Two basic approaches have been used to incorporate measurements from remote sounding
instruments, such as the TIROS Operational Vertical Sounder (TOVS), in data assimilation
systems: (1) assimilate radiances (either clear, cloudy, or cloud-cleared to remove the
effects of cloud) directly; (2) assimilate geophysical products (retrievals) obtained from the
observed radiances. Several operational numerical weather prediction (NWP) centers have recently
moved from the more traditional approach of assimilating retrieved products to radiance
assimilation using a variational approach (e.g., Andersson et al. 1994, 1998; Derber and Wu 1998). There are strong
indications that the implementation of direct radiance assimilation at the National Centers 
for Environmental Prediction (NCEP) has resulted in a large positive impact on forecast 
skill, both in the northern and southern hemispheres (Derber and Wu 1998). However, a 
number of changes were introduced simultaneously to the NCEP DAS, including improve- 
ments in quality control and systematic error correction algorithms. It would be extremely 
interesting to study the performance of various assimilation techniques by means of a con- 
trolled set of experiments using a fixed DAS and a single, quality-controlled input data set 
with a fixed systematic error correction scheme. G. Paul (private communication, 1997) has
shown that the assimilation of TOVS retrievals can be dramatically improved with rigorous 
quality control and that the impact of quality-controlled retrievals can be comparable to 
that obtained with radiance assimilation. 

The shift toward radiance assimilation has resulted in part from theoretical work by Eyre 
et al. (1993), who argued that assimilation of retrieved products amounts to a suboptimal 
use of the data. Retrievals are produced by combining observations with a prior estimate of 
the state of the atmosphere, possibly obtained from a forecast model, from climatological 
data, or from a data base of physically feasible vertical profiles. By assimilating the retrievals 
rather than the radiances into a DAS, additional information from the prior estimate will 
enter the system along with the measurement information. Errors in retrievals partly depend 
on the errors in the prior estimate used to produce them, and it is reasonable to expect that 
the latter are correlated with the errors in the background field used for the assimilation. The 
resulting cross-covariances between retrieval and background errors are not easily quantified 
and usually ignored in the assimilation. Clearly, if the retrieval strongly depends on prior 
information, and if the retrieval errors are misrepresented in the assimilation system, then 
the assimilation will be suboptimal. 

In selecting an appropriate assimilation method, computational and other practical issues
must be considered as well. Even if radiance assimilation is more desirable from a theoretical
point of view, the computational cost of assimilating retrievals can be significantly
less. This is especially pertinent for advanced sounding instruments such as the Atmospheric 
InfraRed Sounder (AIRS), which will fly on NASA’s Earth Observing System PM Platform, 
and the Infrared Atmospheric Sounding Interferometer (IASI), to fly on the European Mete- 
orological Satellite (EUMETSAT) Polar System. These instruments have one or two orders 
of magnitude more spectral channels available than TOVS. Because of this dramatic increase
in data volume, computational costs and simplified logistics may ultimately be the decisive 
factors in choosing an appropriate assimilation strategy for these instruments. A dedicated 
science team has been formed for the AIRS instrument whose task in part is to produce 
high-quality retrieved products that could be used for data assimilation. Combining the ex- 
perience, expertise, and algorithm development of data assimilation centers and instrument 
teams would be highly beneficial to both groups. 

In Joiner and da Silva (1998), referred to as Part I in this article, we explored various 
alternatives to radiance assimilation, with an eye toward the assimilation of future data from 
advanced sounding instruments. For data assimilation systems such as the Physical-space 
Statistical Analysis System (PSAS) that has been developed at the NASA Goddard Data 
Assimilation Office (DAO), the computational cost goes up dramatically as the number of 
observations increases. Therefore, we focused in Part I on methods to compress the radi- 
ance information from high spectral resolution instruments. For AIRS and IASI, the cost 
of assimilating radiances will be significantly greater than that of assimilating retrievals in 
a PSAS-type DAS. The number of AIRS and IASI radiance measurements for temperature 
soundings can be 50 times larger than the number of useful pieces of information for a DAS. 
We showed in Part I that a compact representation of a retrieved product can be defined from 
which the retrieval prior information has been largely removed. The information content of 
the compact retrieval is essentially the same as that of the original set of radiance measure- 
ments. Consequently, the assimilation of compact retrievals (or compressed radiances) results 
in nearly optimal analyses, while retaining some of the practical advantages of traditional 
retrieval assimilation. 

In the present paper we address the following question: how much deterioration actually
results from a suboptimal assimilation of retrieved products, due to correlations between
retrieval and forecast errors? Starting from the nonlinear statistical analysis equations, we
compare the analysis errors obtained by suboptimal assimilation of retrievals (i.e., by
neglecting to account for the cross-covariances between retrieval and background errors) with
the errors that would result from optimal radiance assimilation. We consider interactive
retrievals, for which the retrieval prior estimate is identical to the background used in the
assimilation, as a special case. The error analysis is illustrated with one-dimensional assimi- 
lation experiments using simulated data from high- and low-resolution infrared sounders. 

The outline of the paper is as follows. In section 2 we present a general error analysis 
for various assimilation methods. We first review the statistical analysis equations for non- 
linear observation operators. We then apply these equations to the error analysis of radiance 
assimilation. We briefly discuss the production of 1D retrievals, followed by the error
analysis for retrieval assimilation. We then show that in the 1D case, suboptimal assimilation
of interactive retrievals is equivalent to direct radiance assimilation with a modified (and 
therefore suboptimal) gain. This result allows us to assess the impact on analysis errors of 
cross-covariances between retrieval and background errors. In section 3 we describe the con- 
figuration and results of our numerical experiments. We briefly discuss our conclusions and 
future work in section 4. 

2. Error analysis for various assimilation methods 

Here we derive approximate expressions for the analysis error covariances associated with 
the direct assimilation of radiances on the one hand and with the suboptimal assimilation 
of 1D retrievals on the other. We are primarily concerned with the impact of neglecting the
cross-covariances between retrieval and background errors in retrieval assimilation. In prac- 
tice, of course, there are many additional approximations involved in assimilating remotely 
sensed data. Minimum-variance assimilation of observations into a DAS requires the complete
specification of observation and background error covariances, which are, at best, only
approximately known. However, in this section we assume that both the observation error
covariance (including both instrument and transfer model errors) and the background error 
covariance are known. This implies the possibility of optimal direct radiance assimilation. 
The resulting analysis error covariance can then be regarded as a lower bound or benchmark
for other assimilation methods. 


(a) Nonlinear statistical analysis 

The objective of statistical analysis is to produce a statistically accurate estimate of 
the atmospheric state, given a set of observations and a background usually in the form of a 
short-term forecast. The variational framework (e.g., Lorenc 1986; Talagrand 1988) provides 
an estimate of the state by minimizing the functional 


$$ J(w) = (w - w^f)^T (P^f)^{-1} (w - w^f) + (w^o - h(w))^T R^{-1} (w^o - h(w)), \tag{1} $$

where the unknown vector $w$ represents the 3D state of the atmosphere, $w^f$ is the background
estimate (first guess), $w^o$ is the observation vector, $P^f$ is the background error covariance
matrix, $R$ is the observation error covariance matrix, and $h(w)$ is the observation operator
(generally nonlinear) that maps the 3D atmospheric state into observables. If the background
and observation errors are unbiased, normally distributed, and uncorrelated with each other,
and if the covariances $P^f$ and $R$ are correctly specified, then the analysis state obtained by
minimizing $J(w)$ is the mode of the conditional probability density function $p(w \mid w^f \cup w^o)$
(Jazwinski 1970).

The minimum of $J(w)$ can be obtained by a quasi-Newton iteration of the form

$$ w_{i+1} = w^f + K_i \left[ w^o - h(w_i) + H_i (w_i - w^f) \right], \tag{2} $$

(e.g., Rodgers, 1976) where the subscript $i$ denotes the iteration, $K_i$ is the Kalman gain
matrix given by

$$ K_i = P^f H_i^T \left( H_i P^f H_i^T + R \right)^{-1}, \tag{3} $$

and $H_i$ is a linearized version of $h$, i.e.,

$$ H_i = \left. \frac{\partial h(w)}{\partial w} \right|_{w = w_i}. \tag{4} $$

The analysis vector, $w^a$, is the state obtained at convergence:

$$ w^a = \lim_{i \to \infty} w_i. \tag{5} $$

At convergence, (2) becomes

$$ \begin{aligned} w^a &= w^f + K \left[ w^o - h(w^a) + H (w^a - w^f) \right] \\ &= w^f + K \left[ w^o - H w^f \right] - K \left[ h(w^a) - H w^a \right], \end{aligned} \tag{6} $$

where

$$ K = P^f H^T \left( H P^f H^T + R \right)^{-1}, \tag{7} $$

$$ H = \left. \frac{\partial h(w)}{\partial w} \right|_{w = w^a}. \tag{8} $$

We will refer to equations (6-8) collectively as the nonlinear analysis equations.
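
As an illustration only, the iteration (2)-(3) can be sketched for a scalar state. The observation operator `h` below (an exponential), the function names, and all numerical values are hypothetical choices made for this sketch; they are not taken from the experiments in this paper. At convergence the gradient of the cost function (1) should vanish, which the helper `grad_J` makes easy to check.

```python
import math


def h(w):
    """Hypothetical nonlinear observation operator (assumption for this sketch)."""
    return math.exp(w)


def h_prime(w):
    """Derivative of h; plays the role of the linearization H_i in (4)."""
    return math.exp(w)


def analyse(w_f, w_o, p_f, r, max_iter=100, tol=1e-13):
    """Iterate the scalar form of (2)-(3), starting from the background w_f."""
    w = w_f
    for _ in range(max_iter):
        H = h_prime(w)
        K = p_f * H / (H * p_f * H + r)                  # gain, as in (3)
        w_next = w_f + K * (w_o - h(w) + H * (w - w_f))  # update, as in (2)
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    return w


def grad_J(w, w_f, w_o, p_f, r):
    """Gradient of the scalar cost function (1) with respect to w."""
    return 2.0 * (w - w_f) / p_f - 2.0 * h_prime(w) * (w_o - h(w)) / r


# Hypothetical numbers: background 0.0 (variance 0.25), observation 1.2 (variance 0.04).
w_a = analyse(w_f=0.0, w_o=1.2, p_f=0.25, r=0.04)
```

Because the observation is more accurate than the background in this example, the analysis `w_a` is drawn most of the way from the background toward the state that reproduces the observation.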

If the observation operator is linear, then the matrix $H$ is constant and $h(w) = Hw$ (only
a single iteration of (2) is needed in that case). The analysis equation (6) then becomes

$$ w^a = w^f + K \left[ w^o - H w^f \right]. \tag{9} $$

If we now consider the possibility of cross-covariance between background and observation
errors, denoted by $X$, it follows that the analysis error covariance $P^a$ is

$$ \begin{aligned} P^a = {} & (I - KH) P^f (I - KH)^T + K R K^T \\ & + K X (I - KH)^T + (I - KH) X^T K^T. \end{aligned} \tag{10} $$

This expression is valid for any gain matrix $K$ (e.g., for the optimal gain given by (7) or any
suboptimal gain). If background and observation errors are uncorrelated, then $X = 0$ and
(10) reduces to

$$ P^a = (I - KH) P^f (I - KH)^T + K R K^T. \tag{11} $$

If, in addition, $K$ is given by (7), then this expression further reduces to

$$ P^a = (I - KH) P^f. \tag{12} $$


In the case of a nonlinear observation operator this error analysis is inexact, due to the presence
of the term $K[h(w^a) - Hw^a]$ in (6). The expressions (10-12) can be used to approximate the
actual analysis error covariances when the linearized observation operator $H$ is evaluated at
$w = w^a$, as in (8). The local accuracy of the approximations then depends on the magnitude
of the linearization error $[h(w) - Hw]$ at $w = w^a$.
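
The step from (11) to (12) can be made explicit. The following short derivation is a sketch that uses only the definition (7) of the optimal gain and the symmetry of the covariance matrices:

```latex
\begin{align*}
K \left( H P^f H^T + R \right) &= P^f H^T
  && \text{(from (7))} \\
P^a &= (I - KH) P^f (I - KH)^T + K R K^T \\
    &= P^f - K H P^f - P^f H^T K^T + K \left( H P^f H^T + R \right) K^T \\
    &= P^f - K H P^f - P^f H^T K^T + P^f H^T K^T \\
    &= (I - KH) P^f .
\end{align*}
```

For any gain other than (7) the cancellation in the fourth line does not occur, which is why (11), or (10) when $X \neq 0$, must be used to evaluate suboptimal gains.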
(b) Optimal direct radiance assimilation

The observation operator associated with radiance measurements involves an approximate
radiative transfer or empirical model, which we denote by $f(z, b)$. This model can
be used to simulate radiances given any state $z$. The vector $b$ represents state-independent
model parameters. The state variables $z$ of the radiative transfer model are generally compatible
with the state variables $w$ of the background, in the sense that both vectors are discrete
representations of the same geophysical quantities in the same physical domain. However, z 
and w are not necessarily defined at the same locations, so that interpolation is needed to 
change from one state representation to another. The observation operator associated with 
radiance assimilation is therefore 


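$$ h(w) = f(z, b) = f(\mathcal{I} w, b), \tag{13} $$

where $\mathcal{I}$ is an interpolation operator that maps forecast model state variables to the state
representation of the radiative transfer model. The linearized observation operator $H$ is then

$$ H = \frac{\partial h}{\partial w} = \frac{\partial f}{\partial z} \frac{\partial z}{\partial w} = F \mathcal{I}, \tag{14} $$
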
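with $F$ the Jacobian of the radiative transfer model. The nonlinear analysis equations (6-8)
applied to radiance assimilation are therefore

$$ w^a = w^f + K^y \left[ y - F \mathcal{I} w^f \right] - K^y \left[ f(\mathcal{I} w^a, b) - F \mathcal{I} w^a \right], \tag{15} $$

$$ K^y = P^f \mathcal{I}^T F^T \left( F \mathcal{I} P^f \mathcal{I}^T F^T + R^y \right)^{-1}, \tag{16} $$

$$ F = \left. \frac{\partial f}{\partial z} \right|_{z = \mathcal{I} w^a}, \tag{17} $$
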

where $y$ is a vector of radiance measurements, and $R^y$ is the radiance (or equivalent brightness
temperature) error covariance accounting for both instrument error and transfer model
error, as discussed in Part I, by Eyre et al. (1993), and by Rodgers (1990). If the assumption
holds that radiance and background errors are uncorrelated, then the linear approximation 
(12) applies. The analysis error covariance for optimal direct radiance assimilation, therefore, 
is approximately 


$$ P^a \approx (I - K^y F \mathcal{I}) P^f = (I - K^y F \mathcal{I}) P^f (I - K^y F \mathcal{I})^T + K^y R^y (K^y)^T. \tag{18} $$

The accuracy of this approximation depends on the size of the transfer model linearization
error $[f(z, b) - f(z^a, b) - F(z - z^a)]$ at $z = \mathcal{I} w^a$. The expression (18) serves as a lower
bound for other, suboptimal, assimilation methods.

(c) Production of optimal 1D retrievals

A satellite-based remote sounding instrument measures radiances in a number of spectral 
intervals for each pixel in the instrument field-of-view. For both nadir and limb viewing 
instruments, these radiances can then be used to estimate (or retrieve) a vertical profile of 
atmospheric parameters such as temperature or humidity. A prior state estimate is needed 
to supplement the measurement information if the observing system does not completely 
resolve the vertical structure of the profile. The physics of radiative transfer generally make 
nadir viewing instruments insensitive to the high frequency components of the atmosphere’s 
vertical structure. Therefore, retrievals produced from nadir sounding microwave and infrared 
instruments such as the TOVS may include a significant amount of information from the 
prior estimate. 

The retrieval process is analogous to the general data assimilation problem described 
in section 2. That is, the retrieval $z^r$ is a state estimate obtained by combining radiance
measurements $y$ with a prior state estimate (or background) $z^p$, by means of an estimator $D$:

$$ z^r = D(y, b, z^p). \tag{19} $$

The retrieval $z^r$ can be regarded as a one-dimensional analysis of the atmospheric state.
In practice (19) is solved repeatedly, using different subsets of the radiance observations, 
to produce a set of vertical profiles defined at the horizontal locations within the satellite 
swath. 

Errors associated with 1D retrievals defined at different locations are not independent.
It can be shown (e.g., Part I) that

$$ R^z \approx (I - D_y F) P^p (I - D_y F)^T + D_y R^y D_y^T \tag{20} $$

is a linear approximation to the retrieval error covariance, where

$$ D_y = \frac{\partial D}{\partial y}, \tag{21} $$

and $P^p$ is the error covariance associated with the prior state estimate. The latter involves
horizontal as well as vertical correlations, and (20) therefore shows that the errors in retrievals
at different locations must be correlated as well. Note the analogy between this expression
for the retrieval error covariance $R^z$ and (11); see also Eyre (1987) and Rodgers (1990).

So far we have not made any assumptions about the nature of the retrieval algorithm, 
symbolically expressed by the operator $D$ in (19). Given the prior estimate $z^p$ and independent
data $y$, the optimal nonlinear one-dimensional retrieval $z^r$ minimizes the likelihood
functional

$$ J(z) = (z - z^p)^T (P^p)^{-1} (z - z^p) + (y - f(z, b))^T (R^y)^{-1} (y - f(z, b)). \tag{22} $$

The analogy with (1), which is a three-dimensional version of (22), is clear. The nonlinear 
analysis of the previous sections can be applied here as well, and so it follows that the optimal 
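nonlinear 1D retrieval satisfies

$$ z^r = z^p + D_y \left[ y - F z^p \right] - D_y \left[ f(z^r, b) - F z^r \right], \tag{23} $$

$$ D_y = P^p F^T \left( F P^p F^T + R^y \right)^{-1}, \tag{24} $$

$$ F = \left. \frac{\partial f}{\partial z} \right|_{z = z^r}. \tag{25} $$
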
The error covariance of the optimal 1D retrieval is approximately

$$ R^z \approx (I - D_y F) P^p. \tag{26} $$

The accuracy of this approximation depends on the size of the transfer model linearization
error $[f(z, b) - Fz]$ at $z = z^r$.

In practice the retrieval error covariance is not computed by either (20) or (26), but 
rather modeled and/or estimated directly. Da Silva et al. (1996) provide empirical evidence 
for the presence of both horizontally correlated and uncorrelated retrieval error components, 
consistent with the two terms in (20). They also show how one can estimate the variances 
of both components, as well as the decorrelation length of the horizontally correlated com- 
ponent, based on the output of a DAS. 


(i) Interactive retrievals 
Interactive retrievals are produced by taking the same background used by the DAS
(i.e., a current short-term forecast) as the prior state estimate in the retrieval process. Then

$$ z^p = \mathcal{I} w^f \tag{27} $$

and consequently

$$ P^p = \mathcal{I} P^f \mathcal{I}^T. \tag{28} $$

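Substitution into (23-25) defines the optimal interactive 1D retrieval as

$$ z^r = \mathcal{I} w^f + D_y \left[ y - F \mathcal{I} w^f \right] - D_y \left[ f(z^r, b) - F z^r \right], \tag{29} $$

$$ D_y = \mathcal{I} P^f \mathcal{I}^T F^T \left( F \mathcal{I} P^f \mathcal{I}^T F^T + R^y \right)^{-1}, \tag{30} $$

$$ F = \left. \frac{\partial f}{\partial z} \right|_{z = z^r}. \tag{31} $$
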

Using (26) and (28), the retrieval error covariance $R^z$ is approximately

$$ R^z \approx (I - D_y F) \mathcal{I} P^f \mathcal{I}^T. \tag{32} $$

It follows directly from the linear part of (29) that the retrieval/background error
cross-covariance $X$ is approximately

$$ X \approx (I - D_y F) \mathcal{I} P^f. \tag{33} $$

Note that (32, 33) together imply

$$ R^z \approx X \mathcal{I}^T, \tag{34} $$

which would be exact in case of a linear radiative transfer model f. From (34) it is clear 
that, in the general, nonlinear case, the retrieval-forecast error cross-covariance can be of the 
same order of magnitude as the covariance of the retrieval error itself. 
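
Relations (32)-(34) can be sketched as follows, under the assumption of a linear radiative transfer model. The symbols $e^f$, $e^r$, and $\varepsilon^y$ (forecast, retrieval, and radiance errors) are introduced here for illustration only and do not appear elsewhere in the text.

```latex
\begin{align*}
e^r &\approx (I - D_y F)\, \mathcal{I} e^f + D_y \varepsilon^y
  && \text{(linear part of (29))} \\
X &= E\left[ e^r (e^f)^T \right] \approx (I - D_y F)\, \mathcal{I} P^f
  && \text{(since } E\left[ \varepsilon^y (e^f)^T \right] = 0 \text{)} \\
R^z &= E\left[ e^r (e^r)^T \right]
     \approx (I - D_y F)\, \mathcal{I} P^f \mathcal{I}^T (I - D_y F)^T + D_y R^y D_y^T \\
    &= (I - D_y F)\, \mathcal{I} P^f \mathcal{I}^T = X \mathcal{I}^T
  && \text{(using (30))}
\end{align*}
```

The last line relies on the retrieval gain $D_y$ being the optimal one given by (30), in the same way that (12) follows from (11) for the optimal analysis gain.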

(d) Retrieval assimilation

In traditional retrieval assimilation the retrievals $z^r$ are simply treated as observations
of the atmospheric state $w$. The observation operator is then linear:

$$ h(w) = \mathcal{I} w, \tag{35} $$

since it merely involves interpolation from the forecast model state representation to the
retrieval state representation. The analysis is then simply

$$ w^a = w^f + K^z \left[ z^r - \mathcal{I} w^f \right], \tag{36} $$

where $K^z$ is the gain matrix for retrieval assimilation, which we will now examine more
carefully.

Although the error analysis for retrieval assimilation is linear, it is complicated by the 
fact that the retrieval errors partly depend on the errors in the prior state estimate used in 
the retrieval process. It is likely, even in the case of non-interactive retrievals, that errors 
in the prior estimate are correlated with errors in the forecast $w^f$. This could be caused,
for example, by a common dependence of the estimation errors on the current atmospheric 
state. Therefore one has to assume in general that the retrieval errors are correlated with the 
forecast errors as well. Given a retrieval-forecast error cross-covariance X, it can be shown 
that the optimal gain (in the linear minimum-variance sense) is given by 

$$ K^{zo} = \left( P^f \mathcal{I}^T - X^T \right) \left( \mathcal{I} P^f \mathcal{I}^T + R^z - \mathcal{I} X^T - X \mathcal{I}^T \right)^{-1}. \tag{37} $$

In practice, $X$ is usually neglected because it is difficult to estimate; see, however, da Silva
et al. (1996). Furthermore, numerical solution of the analysis equations using (37) is complicated
when the cross-covariance terms are large, because the matrix $K^{zo}$ then becomes
ill-conditioned. Eyre et al. (1993) used the approach of Lorenc et al. (1986) to control the
associated numerical instabilities, by mapping the 1D retrievals into a reduced space and
then modifying both the retrievals and their error variances appropriately.

The (suboptimal) gain $K^{zso}$ obtained by neglecting $X$ in (37) is

$$ K^{zso} = P^f \mathcal{I}^T \left( \mathcal{I} P^f \mathcal{I}^T + R^z \right)^{-1}. \tag{38} $$

Assimilation of retrievals using a gain matrix of this form has been implemented operationally 
in a number of data assimilation systems (Goldberg et al. 1993; Susskind and Pfaendtner 
1989). 
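
To make the difference between (37) and (38) concrete, the following minimal sketch evaluates both gains for a single scalar mode, with the interpolation operator set to 1. The function names and the numerical values are hypothetical and chosen only for illustration; the analysis variance is computed from the scalar form of (10), which holds for any gain.

```python
def analysis_variance(k, p_f, r_z, x):
    """Scalar form of (10) with H = 1; valid for any gain k."""
    return (1.0 - k) ** 2 * p_f + k ** 2 * r_z + 2.0 * k * (1.0 - k) * x


def optimal_gain(p_f, r_z, x):
    """Scalar form of (37), accounting for the cross-covariance x."""
    return (p_f - x) / (p_f + r_z - 2.0 * x)


def suboptimal_gain(p_f, r_z):
    """Scalar form of (38), with the cross-covariance neglected."""
    return p_f / (p_f + r_z)


# Hypothetical variances: forecast 1.0, retrieval 0.5, cross-covariance 0.3.
p_f, r_z, x = 1.0, 0.5, 0.3
var_opt = analysis_variance(optimal_gain(p_f, r_z, x), p_f, r_z, x)
var_sub = analysis_variance(suboptimal_gain(p_f, r_z), p_f, r_z, x)
```

In this sketch `var_sub` exceeds `var_opt` whenever the cross-covariance is nonzero, and the two gains coincide when `x` is zero.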

We now examine the analysis equations for the assimilation of retrievals with an arbitrary
gain matrix $K^z$. Combining (23-25) with the retrieval analysis equation (36) gives

$$ w^a = w^f + K^z \left[ z^p + D_y (y - F z^p) - \mathcal{I} w^f \right] - K^z D_y \left[ f(z^r, b) - F z^r \right], \tag{39} $$

with $D_y$ and $F$ as defined in (24) and (25). Lacking an explicit relationship between the prior
state estimate $z^p$ used in the retrieval process and the forecast $w^f$, equation (39) cannot be
further simplified. Based on the linear terms in (39), an approximation for the analysis error
covariance is given by

$$ \begin{aligned} P^a \approx {} & (I - K^z \mathcal{I}) P^f (I - K^z \mathcal{I})^T \\ & + (K^z - K^z D_y F) P^p (K^z - K^z D_y F)^T \\ & + (K^z D_y) R^y (K^z D_y)^T \\ & + (I - K^z \mathcal{I}) (P^{pf})^T (K^z - K^z D_y F)^T \\ & + (K^z - K^z D_y F) P^{pf} (I - K^z \mathcal{I})^T. \end{aligned} \tag{41} $$

The first three terms in (41) involve error covariances of the forecast, the prior estimate for the
retrieval, and the radiance observations, respectively. The last two terms involve the
cross-covariance $P^{pf}$ between prior estimation errors and forecast errors.
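
As a sanity check on the structure of this expression, the sketch below evaluates its scalar form (interpolation operator set to 1) and compares it with the scalar form of the interactive-retrieval result (50) in the special case where the prior is the forecast itself. The function names and numbers are hypothetical and serve only as an illustration.

```python
def retrieval_analysis_variance(k_z, d_y, f, p_f, p_p, r_y, p_pf):
    """Scalar form of the five-term analysis error expression above."""
    a = 1.0 - k_z              # weight on the forecast error
    b = k_z - k_z * d_y * f    # weight on the retrieval prior error
    c = k_z * d_y              # weight on the radiance error
    return a * a * p_f + b * b * p_p + c * c * r_y + 2.0 * a * b * p_pf


def radiance_form_variance(k, f, p_f, r_y):
    """Scalar form of (50) for an effective radiance gain k."""
    return (1.0 - k * f) ** 2 * p_f + k * k * r_y


# Interactive case (hypothetical numbers): prior = forecast, so p_p = p_f = p_pf.
p, f, r_y, k_z = 1.0, 2.0, 0.5, 0.6
d_y = p * f / (f * p * f + r_y)          # scalar form of (30)
general = retrieval_analysis_variance(k_z, d_y, f, p, p, r_y, p)
interactive = radiance_form_variance(k_z * d_y, f, p, r_y)
```

The two values agree for any choice of `k_z`, which reflects the fact that in the interactive case the forecast and prior errors are one and the same, so their weights simply add.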


(i) Assimilation of interactive retrievals 

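Next we specialize to assimilating interactive retrievals, first with an arbitrary gain
matrix $K^z$. Combining (29-31) with the retrieval analysis equation (36) gives

$$ w^a = w^f + K^z D_y \left[ y - F \mathcal{I} w^f \right] - K^z D_y \left[ f(z^r, b) - F z^r \right], \tag{42} $$

$$ D_y = \mathcal{I} P^f \mathcal{I}^T F^T \left( F \mathcal{I} P^f \mathcal{I}^T F^T + R^y \right)^{-1}, \tag{43} $$

$$ F = \left. \frac{\partial f}{\partial z} \right|_{z = z^r}. \tag{44} $$
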

Comparison with the nonlinear analysis equations (15-17) for direct radiance assimilation 
shows precisely the sense in which the assimilation of interactive retrievals can be regarded 
as a suboptimal form of direct radiance assimilation. 

First, note that the Jacobian $F$ is evaluated at $z = \mathcal{I} w^a$ for radiance assimilation but
at $z = z^r$ for interactive retrieval assimilation. This discrepancy is strictly due to the
nonlinearity of the radiative transfer model $f(z, b)$. Second, the gain matrix $K^y$ for radiance
assimilation is replaced by $K^z D_y$ for retrieval assimilation. This modifies the linear terms
of the analysis equation and therefore represents the most significant difference between
radiance assimilation and retrieval assimilation.
Let us now assume that the nonlinear component of the radiative transfer model is
small, i.e.,

$$ f(\delta z, b) \approx F \, \delta z \tag{45} $$

for a constant matrix $F$. This linearity assumption cannot be expected to be uniformly valid
(i.e., for all possible retrieval states $z$), but it should be reasonably accurate locally (i.e., for
$z$ in some neighborhood of $z = \mathcal{I} w^a$). Using (45) the linearized radiance analysis equations
(15-17) are

$$ w^a = w^f + K^y \left[ y - F \mathcal{I} w^f \right], \tag{46} $$

$$ K^y = P^f \mathcal{I}^T F^T \left( F \mathcal{I} P^f \mathcal{I}^T F^T + R^y \right)^{-1}. \tag{47} $$

On the other hand, the linearized interactive retrieval analysis equations (42-44) are

$$ w^a = w^f + K^{yso} \left[ y - F \mathcal{I} w^f \right], \tag{48} $$

$$ K^{yso} = K^z \mathcal{I} K^y. \tag{49} $$

The matrix factor $K^z \mathcal{I}$ multiplying $K^y$ in (49) reflects the fact that, in general, the
assimilation of retrieved products amounts to a suboptimal use of radiance data. A linear
approximation for the analysis error covariance associated with interactive retrieval
assimilation based on (48) is

$$ P^a \approx (I - K^{yso} F \mathcal{I}) P^f (I - K^{yso} F \mathcal{I})^T + K^{yso} R^y (K^{yso})^T. \tag{50} $$

Note that this expression does not involve the retrieval-forecast error cross-covariance $X$.
To assess the (linear) effect on the analysis error of, for example, neglecting the error
cross-covariance terms, (50) can be compared with (18) for optimal direct radiance assimilation.

Consider, for the moment, the optimal retrieval gain $K^z = K^{zo}$ given by (37). Under the
linear approximation it follows from (34) that

$$ K^{zo} = \left( P^f \mathcal{I}^T - X^T \right) \left( \mathcal{I} P^f \mathcal{I}^T - \mathcal{I} X^T \right)^{-1}. \tag{51} $$

This expression shows that (46) and (48) would be identical but for the appearance of
the interpolation operator $\mathcal{I}$ in several places. This proves the linear equivalence between
optimal radiance assimilation and optimal assimilation of optimal 1D retrievals, apart from
interpolation effects. As mentioned earlier, however, the optimal retrieval gain is impractical
from a computational point of view. Using (34) again we have the alternative expression

$$ K^{zo} = \left( P^f \mathcal{I}^T - X^T \right) \left( \mathcal{I} P^f \mathcal{I}^T - R^z \right)^{-1}. \tag{52} $$

The second matrix factor on the right-hand side is difficult to invert, unless all its eigenvalues 
are bounded away from zero. This condition is violated whenever the observing system does 
not completely resolve the vertical structure of the profile, because in that case there is at 
least one mode for which the retrieval accuracy is comparable to the forecast accuracy. 

Of more practical interest is the following analysis for the suboptimal retrieval gain
$K^z = K^{zso}$ defined by (38), which was obtained by neglecting the retrieval-forecast error
cross-covariances. We consider two extreme cases when (1) the retrievals are completely determined
by the radiance observations alone, or (2) the retrievals depend exclusively on the forecast,
which is the prior state estimate used in the interactive retrieval process. Substituting (32)
into (38), we obtain

$$ K^{zso} \mathcal{I} = P^f \mathcal{I}^T \left( \mathcal{I} P^f \mathcal{I}^T + (I - D_y F) \mathcal{I} P^f \mathcal{I}^T \right)^{-1} \mathcal{I}. \tag{53} $$

Note that $K^{zso} \mathcal{I}$ is the matrix factor that modifies the optimal gain for the radiance data;
see (49). The linear part of the interactive retrieval equation (29) can be written

$$ z^r = \left[ I - D_y F \right] \mathcal{I} w^f + D_y y. \tag{54} $$

If the state is overwhelmingly determined by the radiance observations, then $D_y F \approx I$, i.e.,
the retrieval is almost independent of the prior estimate (see Part I). Equation (53)
then shows that the difference between radiance assimilation and retrieval assimilation is
due only to the appearance of the interpolation operator $\mathcal{I}$; neglecting interpolations we
have $K^{zso} \mathcal{I} \approx I$. This shows, not unexpectedly, that in this case the effect of ignoring the
cross-covariance terms in the retrieval assimilation is negligible.

In the other extreme, suppose that the radiance observations contain virtually no information.
Then $D_y F \approx 0$, and (53) then implies that, ignoring interpolation effects, the
radiance data are assigned only half as much weight as they should be. On the other hand,
(47) implies that the optimal weights for the radiance data are very small to begin with
in this situation, because the radiance errors are so large. Therefore the difference between
optimal radiance assimilation and suboptimal retrieval assimilation is negligible in this case
as well.

The preceding argument applies to each individual mode of the retrieved state. This 
implies that the impact of ignoring the cross-covariance terms in interactive retrieval assimi- 
lation should be largest for modes that are determined partly by the observations and partly 
by the forecast information. 
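
The per-mode argument can be illustrated with a scalar sketch. The fragment below is not taken from the paper: it assumes a single mode with forecast error variance p, radiance error variance r, a scalar Jacobian f, and no interpolation, and all function names are hypothetical. Under these assumptions it compares the analysis error variance obtained with the optimal radiance gain to that obtained when an interactive retrieval is assimilated with the cross-covariance neglected.

```python
def optimal_gain(p, r, f=1.0):
    """Optimal scalar gain for a radiance with Jacobian f and error variance r."""
    return p * f / (f * f * p + r)


def interactive_retrieval_gain(p, r, f=1.0):
    """Equivalent radiance gain when an interactive retrieval is assimilated
    and the retrieval-forecast error cross-covariance is neglected."""
    d = optimal_gain(p, r, f)      # retrieval gain; the forecast is the prior
    rz = (1.0 - d * f) * p         # retrieval error variance
    kz = p / (p + rz)              # suboptimal retrieval gain
    return kz * d


def analysis_variance(k, p, r, f=1.0):
    """Analysis error variance for an arbitrary scalar radiance gain k."""
    return (1.0 - k * f) ** 2 * p + k * k * r


def degradation(p, r, f=1.0):
    """Ratio of suboptimal to optimal analysis error variance."""
    opt = analysis_variance(optimal_gain(p, r, f), p, r, f)
    sub = analysis_variance(interactive_retrieval_gain(p, r, f), p, r, f)
    return sub / opt


if __name__ == "__main__":
    # Radiance-dominated, mixed, and forecast-dominated modes (illustrative values).
    for r in (1e-4, 0.5, 1e4):
        print(f"r = {r:g}: variance ratio = {degradation(1.0, r):.4f}")
```

In this scalar setting the ratio works out to (4 - 3s)/(2 - s)^2, where s = f^2 p/(f^2 p + r) is the fraction of the analysis determined by the radiances. It equals 1 at both extremes and peaks at 1.125 for s = 2/3, which is consistent with the largest impact occurring for modes determined partly by the observations and partly by the forecast.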

3. One-dimensional simulation results 

We compare the analysis errors for one-dimensional optimal radiance assimilation with 
those for several suboptimal retrieval assimilations, using simulated Jacobians for two differ- 
ent infrared sounders: the Atmospheric InfraRed Sounder (AIRS) and the High-resolution 
InfraRed Sounder 2 (HIRS2). HIRS2 has flown continuously on polar-orbiting satellites from 
1978 to the present as part of the TIROS Operational Vertical Sounder or TOVS (see Smith 
et al. 1979). HIRS2 has 19 infrared channels, a single-spot ground resolution at nadir of 
17.4 km, and scans cross-track ±49.5° from nadir. AIRS is an advanced sounder with over 
2000 channels that will fly on the NASA EOS PM platform in the year 2000 (Aumann and 
Pagano, 1994). AIRS has spatial resolution and coverage similar to those of HIRS2, but its 
spectral resolution is more than an order of magnitude greater. 

We focus here on a single aspect of data assimilation for infrared sounders, namely the 
temperature profile information contained in the radiances. The simulated HIRS2 channel 
set includes 11 of the 20 channels (channels 1-7 and 13-16). These are affected mainly by 
CO$_2$ absorption and are typically used for temperature soundings. The AIRS channel set 
includes all 550 available channels between 650 and 742 cm$^{-1}$, between 2160 and 2270 cm$^{-1}$, 
and between 2379 and 2407 cm$^{-1}$. These are the same channel sets used in Part I, and we also 
prescribe the same instrument-specified equivalent noise temperatures as in Part I. Some of 
the HIRS2 and AIRS channels are affected by water vapor absorption and/or the surface 
skin temperature and emissivity, but for simplicity we assume these variables to be known. 

As in Part I, the Jacobian F for each instrument is computed using a fast radiative 
transfer algorithm based on parameterizations similar to the ones described in Susskind 
et al. (1983). The linearized observation operator H is equal to F, because for these 
one-dimensional experiments $\mathcal{I} = I$. Radiance errors for different channels are assumed 
independent, with each variance equal to the square of the channel equivalent noise temperature 
(NE$\Delta$T) plus an additional (0.1 K)$^2$ to account for linearization error. For simplicity, the 
radiative transfer model is taken to be perfect, and we assume clear-sky night-time (i.e., 
no reflected solar radiation) and nadir-viewing conditions. These simulations are sufficiently 
realistic to provide a meaningful comparison between the different assimilation approaches; 
in particular, the same simplifying assumptions are made in all cases. 
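
The radiance error specification described above can be written out as a short sketch. The function names and the example noise values are hypothetical and serve only to illustrate the formula: each channel variance is the squared equivalent noise temperature plus (0.1 K)^2, and the covariance is diagonal because channel errors are assumed independent.

```python
def radiance_error_variances(nedt_kelvin, linearization_error_kelvin=0.1):
    """Diagonal of the radiance error covariance, in K^2, one entry per channel."""
    return [nedt ** 2 + linearization_error_kelvin ** 2 for nedt in nedt_kelvin]


def diagonal_covariance(variances):
    """Expand per-channel variances into a full diagonal matrix (nested lists)."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical equivalent noise temperatures in kelvin, for illustration only.
example_nedt = [0.2, 0.5, 1.0]
Ry = diagonal_covariance(radiance_error_variances(example_nedt))
```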

We specify a thickness forecast error covariance $P^f$ for our experiments at 18 pressure 
levels (0.4, 1, 2, 5, 10, 30, 50, 70, 100, 150, 200, 250, 300, 400, 500, 700, 850, and 1000 hPa) 
based on the Goddard Earth Observing System Data Assimilation System (GEOS DAS) 
6-hour forecast height error covariances. These were estimated from time series of North 
American rawinsonde observed-minus-forecast residuals using the method described in Dee et 
al. (1998a, b). Horizontal forecast error correlations do not play a role in these experiments. 
Retrieval error covariances originally specified for temperature have been hydrostatically 
converted to thickness error covariances. 
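
One way such a hydrostatic conversion could be carried out is sketched below. This is an assumption for illustration rather than the procedure actually used: it treats the temperature errors as layer-mean values, so that each layer thickness is the layer temperature multiplied by (R_d/g) ln(p_bottom/p_top), and the covariance is scaled accordingly. The constants and function names are not from the paper.

```python
import math

R_DRY = 287.05   # J kg^-1 K^-1, gas constant for dry air
G0 = 9.80665     # m s^-2, standard gravity


def thickness_scale_factors(levels_hpa):
    """Metres of layer thickness per kelvin of layer-mean temperature.
    Levels are ordered from the top of the atmosphere downward."""
    return [
        R_DRY / G0 * math.log(levels_hpa[i + 1] / levels_hpa[i])
        for i in range(len(levels_hpa) - 1)
    ]


def temperature_to_thickness_cov(cov_t, levels_hpa):
    """Convert a layer-mean temperature error covariance (K^2) into a
    thickness error covariance (m^2) by scaling rows and columns."""
    c = thickness_scale_factors(levels_hpa)
    n = len(c)
    return [[c[i] * cov_t[i][j] * c[j] for j in range(n)] for i in range(n)]


# Illustrative use with the four lowest levels of the grid listed above.
levels = [500.0, 700.0, 850.0, 1000.0]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
cov_thickness = temperature_to_thickness_cov(identity, levels)
```

With a 1 K temperature error standard deviation in every layer, the thickness error standard deviation in the 850-1000 hPa layer comes out at roughly 4.8 m under these assumptions.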

For radiance assimilation experiments we use the linearized analysis equations (47, 48), 
and estimate the analysis errors using (18). For interactive retrieval assimilation we use (36, 
38), specify retrieval error covariances according to (32, 30), and estimate the analysis errors 
using (51). In Part I we showed by means of Monte Carlo simulations that the linearized ex- 
pressions for the analysis error covariances approximate the errors for this particular problem 
quite well, although the actual errors are slightly underestimated. 

(a) Interactive retrieval assimilation 

(i) Using correct retrieval error covariances 

Figure 1 shows the estimated thickness error standard deviations (in m), as a function 
of pressure level, for radiance assimilation (solid curves) and for interactive-retrieval assim- 
ilation (dashed curves), using either AIRS or HIRS. For reference, the prescribed forecast 
error standard deviations are shown in the figure as well (dashed-dotted curve). Since the 
error covariances are correctly specified for this experiment, interactive-retrieval assimilation 
is suboptimal only because the cross-covariances between retrieval errors and forecast errors 
are not accounted for. The error standard deviations are obtained from the diagonal of the 
analysis error covariance $P^a$ computed for each case. The figure shows that the analysis 
error standard deviations for the two assimilation methods are virtually indistinguishable. 
Not shown are the thickness analysis error vertical correlations, which are also nearly iden- 
tical for the two methods. To gain some insight into this result, we examine separately the 
contributions to the analysis error covariances of the forecast errors and of the radiance 
errors. 

We project the two components of the analysis error covariance onto the eigenvectors 
of $F^T (R^y)^{-1} F$, which are the columns of the unitary matrix $U$ in 

$$F^T (R^y)^{-1} F = U D U^T, \qquad (55)$$

with $D$ a diagonal matrix of eigenvalues. This transformation was used in Part I to produce 
compact Partial Eigen-Decomposition (PED) retrievals. The eigenvectors for the two 
instruments are shown in Figure 2 in order of decreasing eigenvalue, that is to say, in order of 
increasing uncertainty. Accordingly we can define 

$$A = U^T (I - KF) P^f (I - KF)^T U \qquad (56)$$

and 

$$B = U^T K R^y K^T U, \qquad (57)$$

corresponding to the two terms in (18) and (51). The matrix $A$ represents the forecast error 
contribution, and $B$ the radiance error contribution, to the analysis error covariance. Figure 3 
shows the diagonal elements of these two matrices on a logarithmic scale, for the optimal 
(radiance assimilation) case with $K = K^y$ given by (48) and the suboptimal 
(interactive-retrieval assimilation) case with $K = K^{yso}$ given by (50) and (38). 
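
The projection defined by (55)-(57) can be sketched numerically as follows. This is a minimal illustration using NumPy on a small random toy problem; the dimensions, matrices and function name are hypothetical and unrelated to the AIRS and HIRS Jacobians used in the experiments.

```python
import numpy as np


def error_contributions(F, Pf, Ry, K):
    """Return the eigenvalues of F^T Ry^{-1} F (descending) and the diagonals
    of the forecast (A) and radiance (B) error contributions."""
    M = F.T @ np.linalg.solve(Ry, F)
    eigvals, U = np.linalg.eigh(M)           # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # leading mode = largest eigenvalue
    eigvals, U = eigvals[order], U[:, order]
    ImKF = np.eye(Pf.shape[0]) - K @ F
    A = U.T @ ImKF @ Pf @ ImKF.T @ U         # forecast error contribution, cf. (56)
    B = U.T @ K @ Ry @ K.T @ U               # radiance error contribution, cf. (57)
    return eigvals, np.diag(A), np.diag(B)


# Toy problem: 3 levels, 5 channels, all values hypothetical.
rng = np.random.default_rng(0)
F = rng.standard_normal((5, 3))
G = rng.standard_normal((3, 3))
Pf = G @ G.T + 3.0 * np.eye(3)               # symmetric positive definite
Ry = np.diag(np.full(5, 0.25))
K_opt = Pf @ F.T @ np.linalg.inv(F @ Pf @ F.T + Ry)
eigvals, a_diag, b_diag = error_contributions(F, Pf, Ry, K_opt)
```

Because U is orthogonal, the diagonals of A and B sum to the trace of the analysis error covariance, which gives a simple consistency check when the optimal gain is used. Passing a different gain K yields the corresponding decomposition for a suboptimal analysis.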

Figure 3 shows that the interactive-retrieval assimilation effectively assigns too much 
weight to the forecast and too little to the radiance data. The leading 7 modes are well 
determined by the radiance data, so that the analysis errors for these modes are dominated 
by the radiance errors. The slightly increased weight given to the forecast therefore does not 
greatly affect the analysis in the leading modes. For the trailing 7 modes the situation is 
reversed: information from the forecast is dominant, and decreasing the weight given to the 
radiance data likewise does not significantly affect the analysis. For modes in between these 
two extremes (modes 8 and 9), the influence of the information in the forecast is comparable 
to that in the radiances. The change in relative weights in these modes is therefore 
responsible for most of the analysis degradation in interactive-retrieval assimilation. As shown in 
Figure 1, however, the overall degradation as measured by analysis error standard deviations 
in physical space is insignificant. 

Figure 4 is similar to Figure 3, but uses the Jacobian and error covariances for the HIRS 
instrument. The difference in weights in the cross-over modes (modes 2-4) appears to be more 
severe for HIRS than for AIRS. However, as shown in Figure 1, the overall degradation in 
the suboptimal analysis is small in this case as well. 

Table 1 shows the condition numbers of the innovation covariance matrices (i.e., the 
quantity to be inverted when solving the analysis equation) for radiance assimilation and for 
optimal and suboptimal retrieval assimilation for AIRS and HIRS. The numerical condition- 
ing of the analysis equations is slightly better for suboptimal retrieval assimilation than for 
radiance assimilation. This implies that solving the analysis equations (in the PSAS context) 
will be somewhat more efficient for suboptimal retrieval assimilation than for radiance 
assimilation. The condition numbers for the innovation covariance associated with the optimal 
retrieval assimilation gain matrix (37) are very high, implying near singularity. This result 
is expected, as explained in section 2 and by Eyre et al. (1993), and suggests that it will not 
be possible to assimilate retrievals from nadir-viewing instruments such as AIRS and HIRS 
with an optimal gain matrix. 
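
The qualitative ordering of these condition numbers can be reproduced with a small sketch. It assumes the one-dimensional setting with no interpolation, so that the matrix to be inverted is F P^f F^T + R^y for radiances, P^f + R^z for the suboptimal retrieval gain, and P^f - R^z for the optimal retrieval gain as in (52). The diagonal toy matrices are hypothetical and include one mode that the radiances barely observe.

```python
import numpy as np


def innovation_condition_numbers(F, Pf, Ry):
    """Condition numbers of the matrices inverted in each analysis equation."""
    n = Pf.shape[0]
    D = Pf @ F.T @ np.linalg.inv(F @ Pf @ F.T + Ry)   # interactive retrieval gain
    Rz = (np.eye(n) - D @ F) @ Pf                      # retrieval error covariance
    return {
        "radiance": np.linalg.cond(F @ Pf @ F.T + Ry),
        "retrieval_suboptimal": np.linalg.cond(Pf + Rz),
        "retrieval_optimal": np.linalg.cond(Pf - Rz),
    }


# Three modes: well observed, partly observed, and almost unobserved.
F = np.diag([10.0, 1.0, 0.01])
Pf = np.eye(3)
Ry = np.eye(3)
cond = innovation_condition_numbers(F, Pf, Ry)
```

In this toy case the suboptimal retrieval matrix is the best conditioned, the radiance matrix is intermediate, and the optimal retrieval matrix is by far the worst, because P^f - R^z has an eigenvalue close to zero for the mode in which the retrieval is no more accurate than the forecast.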

(ii) Using incorrect retrieval error covariances 

We now examine the effect of specifying incorrect retrieval error covariances in the 
assimilation. This would occur in practice, for example, if the DAS employs a homogeneous 
retrieval error covariance model, even though actual retrieval errors are state-dependent. 
Equations (32, 30) show how the interactive retrieval error covariances depend on the forecast 
and brightness temperature error covariances, as well as on the Jacobian of the radiative 
transfer model. The latter is state-dependent due to the nonlinearity of the Planck function, 
while the brightness temperature errors depend on scene brightness temperature. A colder 
scene brightness temperature corresponds to a higher equivalent noise temperature. For 
example, the HIRS2 equivalent noise temperatures for tropical and mid-latitude profiles 
differ by factors ranging from about 0.8 to 1.6 depending on the channel. 
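
The dependence of equivalent noise temperature on scene temperature can be sketched with the Planck function. The fragment below assumes that instrument noise is constant in radiance units, so that the noise temperature scales inversely with dB/dT; the function names, the reference noise value, and the chosen wavenumber are hypothetical.

```python
import math

C1 = 1.191042e-5   # mW m^-2 sr^-1 cm^4, first radiation constant (wavenumber form)
C2 = 1.4387752     # cm K, second radiation constant


def planck_derivative(wavenumber, temperature):
    """dB/dT of the Planck function at a wavenumber (cm^-1) and temperature (K)."""
    x = C2 * wavenumber / temperature
    ex = math.exp(x)
    return C1 * wavenumber ** 3 * ex * x / (temperature * (ex - 1.0) ** 2)


def rescale_nedt(nedt_ref, wavenumber, t_ref, t_scene):
    """Equivalent noise temperature at t_scene, given its value at t_ref,
    assuming the noise is constant in radiance units."""
    return (
        nedt_ref
        * planck_derivative(wavenumber, t_ref)
        / planck_derivative(wavenumber, t_scene)
    )


# A 0.2 K noise level quoted at 280 K, rescaled to a colder 220 K scene at 700 cm^-1.
nedt_cold = rescale_nedt(0.2, 700.0, 280.0, 220.0)
```

Since dB/dT decreases with decreasing temperature, the rescaled value for the colder scene is larger than the reference value, in line with the statement above.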

For these experiments we specify the interactive retrieval error covariances using (32, 
30) as before, but with Jacobians and brightness temperature error covariances computed for 
three different model-generated profiles, corresponding to a low, middle, and high-latitude 
case. These profiles are described in more detail in Part I. We then assimilate, for example, 
interactive retrievals in the tropics using the retrieval error covariances computed for the 
mid-latitude profile. The analysis is then suboptimal, not only because cross-covariances 
between retrieval errors and forecast errors are ignored, but also because the retrieval error 
covariances are misspecified. We can still estimate the analysis error standard deviations for 
these cases, by means of (51) with the gain matrix defined by (50,38). 

Figure 5 shows the estimated thickness error standard deviations for the tropical as- 
similation with AIRS and HIRS, with incorrect error covariances based on the mid-latitude 
profile. Solid curves correspond to (optimal) radiance assimilation, and dashed curves to 
the (suboptimal) retrieval assimilation. The dotted-dashed curve indicates the forecast error 
standard deviations. The differences between the analysis errors for the optimal and sub- 
optimal assimilations are insignificant. We obtain similarly small differences for all other 
profile combinations. These results indicate that, for these one-dimensional simulations, the 
analyses are not sensitive to small misspecifications of the retrieval error covariance. In the 
previous section we showed that, in certain regimes, a misspecification of the errors (e.g., 
neglecting retrieval/background cross-covariance) does not significantly harm the analysis. 
The results of this section imply that, in addition, a relatively small misspecification of the 
retrieval error covariance also does not significantly degrade the suboptimal retrieval assimi- 
lation. This result supports the use of homogeneous retrieval error covariances for interactive 
clear-sky temperature retrieval assimilation. 

(b) Non-interactive retrieval assimilation 

In order to simulate analysis errors that would obtain with non-interactive retrieval 
assimilation, we need to make assumptions about the accuracy of the prior state estimate 
used for the retrieval, and about the cross-covariances between prior estimation errors and 
the forecast errors; see (42). We are interested in the situation where a forecast model 
from an older or different DAS, or from some other source such as climatology, is used as 
the prior information for the retrieval. For this experiment we take the prior estimation 
error covariances to be the same as the forecast error covariances, except that the error 
variances are multiplied by a factor $\alpha^2$. To model the forecast-prior error covariances, we 
multiply the covariances that would result if the errors were perfectly correlated by a factor 
$\gamma$. Thus, $\alpha = 1$, $\gamma = 1$ corresponds to interactive retrieval assimilation. As $\gamma \to 0$ the analysis 
errors may become smaller than those obtained with direct radiance assimilation, because 
the prior state estimate then provides another independent source of information for the 
assimilation. In reality, prior estimation errors and forecast errors are likely to be highly 
correlated. As $\gamma \to 1$ when $\alpha > 1$, the analysis should degrade as the prior state estimate, 
which then contains no additional information over the forecast, is assigned too much weight. 
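
The roles of the two factors can be illustrated with a scalar sketch. It is a simplification under stated assumptions, not the formulation in (42): a single mode with forecast error variance p and radiance error variance r, a prior with error variance alpha^2 p whose covariance with the forecast error is gamma * alpha * p, a retrieval that uses the correct prior error variance, and an assimilation that neglects the retrieval-forecast cross-covariance. All names are hypothetical.

```python
def noninteractive_analysis_variance(p, r, alpha, gamma, f=1.0):
    """Analysis error variance when a non-interactive retrieval is assimilated
    with the retrieval-forecast cross-covariance neglected (scalar case)."""
    prior_var = alpha ** 2 * p                     # prior error variance
    d = prior_var * f / (f * f * prior_var + r)    # retrieval gain
    rz = (1.0 - d * f) * prior_var                 # retrieval error variance
    x = (1.0 - d * f) * gamma * alpha * p          # true cross-covariance
    k = p / (p + rz)                               # gain that ignores x
    return (1.0 - k) ** 2 * p + k * k * rz + 2.0 * k * (1.0 - k) * x


def optimal_radiance_variance(p, r, f=1.0):
    """Analysis error variance for optimal direct radiance assimilation."""
    return p * r / (f * f * p + r)


if __name__ == "__main__":
    p, r = 1.0, 1.0
    print("optimal radiance:", optimal_radiance_variance(p, r))
    for alpha, gamma in ((1.0, 1.0), (1.5, 0.75), (1.5, 0.25), (2.0, 0.75)):
        value = noninteractive_analysis_variance(p, r, alpha, gamma)
        print(f"alpha = {alpha}, gamma = {gamma}: {value:.4f}")
```

In this scalar setting, an independent prior (gamma = 0) with alpha = 1 gives an analysis error variance below that of optimal radiance assimilation, whereas an inaccurate and highly correlated prior (alpha = 2, gamma = 0.75) combined with nearly uninformative radiances gives an analysis error variance larger than the forecast error variance itself.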

The dashed curves in Figure 6 are the estimated analysis errors for the case $\alpha = 1.5$, 
$\gamma = 0.75$. As before, solid curves correspond to (optimal) radiance assimilation, and the 
dotted-dashed curve indicates the forecast error standard deviations. At some altitudes, the HIRS 
analysis errors actually do exceed the forecast errors. Where the information content of the 
radiances is high, such as in the lower troposphere, the degradation with respect to the 
optimal analysis is small. 

Figures 7 and 8 show the same curves but now with $\gamma = 0.50$ and $\gamma = 0.25$, respectively. 
This corresponds to an increase in the amount of independent information contained in 
the prior state estimate for the retrieval. As expected, the results improve as $\gamma$ decreases; 
in fact, when $\gamma = 0.25$ the analysis errors are smaller than those obtained with radiance 
assimilation at almost every altitude. Finally, Figure 9 shows the results for $\alpha = 2.0$, 
$\gamma = 0.75$, corresponding to the use of a relatively inaccurate prior state estimate that is highly 
correlated with the forecast. Clearly the results are much worse in this case. 

4. Conclusions and Future Work 

We set out in this paper to compare different ways of utilizing satellite data, either 
by directly assimilating radiances in a variational framework, or by first producing 
one-dimensional retrievals and then assimilating the retrievals. Actual implementation of either 
method in an operational data assimilation system involves numerous technical details, 
pertaining to quality control, systematic error correction, and covariance tuning. This raises 
the question of whether the recent improvements in forecast skill obtained by centers that 
implemented direct radiance assimilation are due to the change in methodology or are a 
result of various implementation details. In any case, computational and logistical arguments 
favor some form of retrieval assimilation for future high-volume data types, especially for 
PSAS-like assimilation systems. It is therefore important to learn as much as possible about 
the expected analysis errors for various suboptimal assimilation schemes, and to investigate 
whether any negative effects of retrieval assimilation are actually significant in view of the 
many uncertainties inherent in any data assimilation method. 

We presented a theoretical error analysis of the various assimilation methods: direct 
radiance assimilation, interactive retrieval assimilation, and non-interactive retrieval assim- 
ilation. As has been pointed out elsewhere, interactive retrieval assimilation amounts to a 
suboptimal use of radiance data because cross-covariances between the retrieval and back- 
ground errors are not accounted for in the assimilation. We showed that, in fact, interactive 
retrieval assimilation is linearly equivalent to radiance assimilation with modified (hence 
suboptimal) analysis weights. We then showed that the resulting degradation of analysis 
accuracy is small for vertical modes that are determined either by the radiances or by the 
model forecast alone, but that the degradation can be significant for modes that are not well 
determined by either. 

These results were further clarified with a number of one-dimensional numerical ex- 
periments, for which we simulated radiance data from two different infrared sounders: the 
Atmospheric InfraRed Sounder (AIRS) and the High-resolution InfraRed Sounder 2 (HIRS2). 
We found that the degradation of analysis errors due to the assimilation of interactive re- 
trievals, rather than radiances, is insignificant in the context of these experiments. Moreover, 
when we misspecified retrieval error covariances in the retrieval assimilation, the degradation 
was still small. We also reported results from several experiments with the assimilation of 
non-interactive retrievals, using different assumptions about the accuracy of the prior state 
estimate used in the retrieval process, and about the cross-covariances between the prior 
estimation and forecast errors. We found that successful assimilation of non-interactive 
retrievals requires that the accuracy of the prior state estimates used for the retrievals 
be at least comparable to that of the forecast. If not, then the analysis may turn out 
significantly worse than in the case of either direct radiance or interactive retrieval assimilation. 
For an instrument that provides only a small impact at best, as is the case for TOVS in the 
Northern Hemisphere, assimilation of retrievals based on inferior prior state estimates may 
actually produce analyses that are less accurate than the forecast itself. 

Our conclusions are based on theoretical considerations combined with simple one- 
dimensional simulations. We would like to show in future simulations that similar conclusions 
hold in three dimensions, when horizontal correlations of forecast errors play a role as well. 
We also plan to include multiple data types in our simulations and finally to compare different 
assimilation strategies with real data in a full data assimilation system. 

TABLE 1. Condition numbers for the innovation covariance matrix 

                                                   AIRS           HIRS 
Radiance assimilation                              3.25 x 10^3    7.17 x 10^? 
Retrieval assimilation, neglect X (suboptimal)     5.63 x 10^?    6.39 x 10^? 
Retrieval assimilation, account for X (optimal)    7.59 x 10^5    7.11 x 10^8 



Figure 1. Thickness analysis error standard deviations (in m) for optimal radiance assimilation (solid lines) 
and for interactive retrieval assimilation (dashed lines), using simulated AIRS and HIRS data. Forecast error 
standard deviations are shown for reference (dot-dashed line). 

Figure 2. Leading eigenvectors and eigenvalues of $F^T (R^y)^{-1} F$ for AIRS (solid line, $\lambda_1$) and HIRS (dashed 
line, $\lambda_2$) for a mid-latitude profile. 

Figure 3. Forecast and radiance contributions to the analysis error variances, projected onto the eigenvectors 
of Figure 2, for simulated AIRS data. 

Figure 4. As Figure 3, for simulated HIRS2 data. 



Figure 5. As Figure 1, for a simulated tropical profile. Error covariances for the retrieval assimilation were 
incorrectly specified as for a mid-latitude profile. 

Figure 6. As Figure 1, but for non-interactive retrieval assimilation, using $\alpha = 1.5$, $\gamma = 0.75$ for defining the 
error covariances. 



Figure 7. As Figure 1, but for non-interactive retrieval assimilation, using $\alpha = 1.5$, $\gamma = 0.50$ for defining the 
error covariances. 

Figure 9. 





